1 The relational model for database management: version 2
E. F. Codd 
January 1990 Book 

Publisher: Addison-Wesley Longman Publishing Co., Inc. 

Full text available: pdf(20.61 MB) Additional Information: full citation, abstract, references, citings, index terms, review

From the Preface (See Front Matter for full Preface) 

An important adjunct to precision is a sound theoretical foundation. The relational model is 
solidly based on two parts of mathematics: first-order predicate logic and the theory of
relations. This book, however, does not dwell on the theoretical foundations, but rather on 
all the features of the relational model that I now perceive as important for database 
users, and therefore for DBMS vendors. My perceptions result from 20 y ... 

2 Weighted rational transductions and their application to human language processing
Fernando Pereira, Michael Riley, Richard Sproat 

March 1994 Proceedings of the workshop on Human Language Technology HLT '94 
Publisher: Association for Computational Linguistics 

Full text available: pdf(685.83 KB) Additional Information: full citation, abstract, references, citings

We present the concepts of weighted language, transduction and automaton from 
algebraic automata theory as a general framework for describing and implementing 
decoding cascades in speech and language processing. This generality allows us to 
represent uniformly such information sources as pronunciation dictionaries, language 
models and lattices, and to use uniform algorithms for building decoding stages and for 
optimizing and combining them. In particular, a single automata join algorithm ... 

3 Data clustering: a review 
A. K. Jain, M. N. Murty, P. J. Flynn

September 1999 ACM Computing Surveys (CSUR), Volume 31 Issue 3
Publisher: ACM Press 

Full text available: pdf(636.24 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Clustering is the unsupervised classification of patterns (observations, data items, or 
feature vectors) into groups (clusters). The clustering problem has been addressed in
many contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a 
difficult problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 

Keywords: cluster analysis, clustering applications, exploratory data analysis, incremental 
clustering, similarity indices, unsupervised learning 



4 Data mining (DM): Expanding the taxonomies of bibliographic archives with
persistent long-term themes
Rene Schult, Myra Spiliopoulou 

April 2006 Proceedings of the 2006 ACM symposium on Applied computing SAC '06 
Publisher: ACM Press 

Full text available: pdf(210.33 KB) Additional Information: full citation, abstract, references, index terms

As document collections accumulate over time, some of the discussion subjects in them
become outfashioned, while new ones emerge. In this paper, we address the challenge of 
finding such emerging and persistent "themes", i.e. subjects that live long enough to be 
incorporated into a taxonomy or ontology describing the document collection. Our method 
is based on similarity-based clustering and cluster label construction and focusses on the 
identification of cluster labels that "survive" cha ... 

Keywords: clustering, labeling, time series 



5 Automatic expansion of domain-specific lexicons by term categorization
Henri Avancini, Alberto Lavelli, Fabrizio Sebastiani, Roberto Zanoli 

May 2006 ACM Transactions on Speech and Language Processing (TSLP), Volume 3 Issue 1

Publisher: ACM Press 

Full text available: pdf(589.28 KB) Additional Information: full citation, abstract, references, index terms

We discuss an approach to the automatic expansion of domain-specific lexicons, that is, to
the problem of extending, for each $c_i$ in a predefined set $C = \{c_1, \ldots, c_m\}$ of semantic
domains, an initial lexicon $L^i_0$ into a larger lexicon $L^i_1$. Our approach relies on term
categorization, defined as the task of labeling previ ...

Keywords: Lexicons, machine learning, text classification 



6 Accelerating XPath evaluation in any RDBMS
Torsten Grust, Maurice van Keulen, Jens Teubner

March 2004 ACM Transactions on Database Systems (TODS), Volume 29 Issue 1 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, appendices and supplements, abstract, references, cited by, index terms

This article is a proposal for a database index structure, the XPath accelerator, that has 
been specifically designed to support the evaluation of XPath path expressions. As such, 
the index is capable to support all XPath axes (including ancestor, following, preceding- 
sibling, descendant-or-self, etc.). This feature lets the index stand out among related work 
on XML indexing structures which had a focus on the child and descendant axes only. The 
index has been designed with a close ... 



http://portal.acm.org/results.cfm 7/9/07



Results (page 1): +("term text database" or "text database") +search +"new label" ^suggest Page 3 of 7 



Keywords: Main-memory databases, XML, XML indexing, XPath 



7 Labeling images with a computer game 

Luis von Ahn, Laura Dabbish 
April 2004 Proceedings of the SIGCHI conference on Human factors in computing
systems CHI '04 

Publisher: ACM Press 

Full text available: pdf(493.67 KB) Additional Information: full citation, abstract, references, citings, index terms

We introduce a new interactive system: a game that is fun and can be used to create 
valuable output. When people play the game they help determine the contents of images 
by providing meaningful labels for them. If the game is played as much as popular online 
games, we estimate that most images on the Web can be labeled in a few months. Having 
proper labels associated with each image on the Web would allow for more accurate image 
search, improve the accessibility of sites (by providing descriptio ... 

Keywords: World Wide Web, distributed knowledge acquisition, image labeling, online 
games 



8 Papers from the 2003 international conference on Database theory: Incremental
validation of XML documents
Andrey Balmin, Yannis Papakonstantinou, Victor Vianu

December 2004 ACM Transactions on Database Systems (TODS), Volume 29 Issue 4 

Publisher: ACM Press 

Full text available: pdf(576.95 KB) Additional Information: full citation, abstract, references, citings, index terms

We investigate the incremental validation of XML documents with respect to DTDs, 
specialized DTDs, and XML Schemas, under updates consisting of element tag renamings, 
insertions, and deletions. DTDs are modeled as extended context-free grammars. 
"Specialized DTDs" allow the decoupling of element types from element tags. XML 
Schemas are abstracted as specialized DTDs with limitations on the type assignment. For 
DTDs and XML Schemas, we exhibit an O(m log n) incremental valida ...

Keywords: Update, XML, validation 



9 Mode transformations for vision: WebInSight: making web images accessible
Jeffrey P. Bigham, Ryan S. Kaminsky, Richard E. Ladner, Oscar M. Danielsson, Gordon L.
Hempton

October 2006 Proceedings of the 8th international ACM SIGACCESS conference on
Computers and accessibility Assets '06
Publisher: ACM Press 

Full text available: pdf(2.31 MB) Additional Information: full citation, abstract, references, cited by, index terms

Images without alternative text are a barrier to equal web access for blind users. To 
illustrate the problem, we conducted a series of studies that conclusively show that a large 
fraction of significant images have no alternative text. To ameliorate this problem, we 
introduce WebInSight, a system that automatically creates and inserts alternative text into
web pages on-the-fly. To formulate alternative text for images, we present three labeling 
modules based on web context analysis, enhanced opt ... 

Keywords: optical character recognition, transformation proxy, web accessibility, web 
10 Text analysis and extraction: Topic segmentation of message hierarchies for indexing
and navigation support
Jong Wook Kim, K. Selçuk Candan, Mehmet E. Donderler

May 2005 Proceedings of the 14th international conference on World Wide Web 
WWW '05 

Publisher: ACM Press 

Full text available: pdf(333.81 KB) Additional Information: full citation, abstract, references, citings, index terms

Message hierarchies in web discussion boards grow with new postings. Threads of 
messages evolve as new postings focus within or diverge from the original themes of the 
threads. Thus, just by investigating the subject headings or contents of earlier postings in 
a message thread, one may not be able to guess the contents of the later postings. The 
resulting navigation problem is further compounded for blind users who need the help of a 
screen reader program that can provide only a linear re ... 

Keywords: assistive technology for blind users, discussion boards, navigational aid, 
segmentation 



11 Fast Kernel Classifiers with Online and Active Learning
Antoine Bordes, Seyda Ertekin, Jason Weston, Léon Bottou
December 2005 The Journal of Machine Learning Research, Volume 6 
Publisher: MIT Press 

Full text available: pdf(577.37 KB) Additional Information: full citation, abstract 

Very high dimensional learning systems become theoretically possible when training 
examples are abundant. The computing cost then becomes the limiting factor. Any 
efficient learning algorithm should at least take a brief look at each example. But should 
all examples be given equal attention? This contribution proposes an empirical answer. We
first present an online SVM algorithm based on this premise. LASVM yields competitive 
misclassification rates after a single pass over the training examples, ... 



12 Learning classifiers: Using URLs and table layout for web classification tasks

L. K. Shih, D. R. Karger
May 2004 Proceedings of the 13th international conference on World Wide Web
WWW '04

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

We propose new features and algorithms for automating Web-page classification tasks 
such as content recommendation and ad blocking. We show that the automated 
classification of Web pages can be much improved if, instead of looking at their textual 
content, we consider each link's URL and the visual placement of those links on a
referring page. These features are unusual: rather than being scalar measurements like 
word counts they are tree structured— describing the position of the item ... 

Keywords: classification, news recommendation, tree structures, web applications 
Mira Dontcheva, Steven M. Drucker, Geraldine Wade, David Salesin, Michael F. Cohen
October 2006 Proceedings of the 19th annual ACM symposium on User interface
software and technology UIST '06
Publisher: ACM Press 

Full text available: pdf(676.18 KB) Additional Information: full citation, abstract, references, index terms

We describe a system, implemented as a browser extension, that enables users to quickly 
and easily collect, view, and share personal Web content. Our system employs a novel 
interaction model, which allows a user to specify webpage extraction patterns by 
interactively selecting webpage elements and applying these patterns to automatically 
collect similar content. Further, we present a technique for creating visual summaries of 
the collected information by combining user labeling with predefined I ... 

Keywords: information management, template-based summarization, webpage 
extraction patterns 



14 Containment and equivalence for a fragment of XPath 

Gerome Miklau, Dan Suciu 
January 2004 Journal of the ACM (JACM), Volume 51 Issue 1

Publisher: ACM Press 

Full text available: pdf(367.27 KB) Additional Information: full citation, abstract, references, citings, index terms, review

XPath is a language for navigating an XML document and selecting a set of element nodes. 
XPath expressions are used to query XML data, describe key constraints, express 
transformations, and reference elements in remote documents. This article studies the 
containment and equivalence problems for a fragment of the XPath query language, with 
applications in all these contexts. In particular, we study a class of XPath queries that
contain branching, label wildcards and can express descendant relation ... 

Keywords: Tree pattern matching, XPath expressions, query containment, query 
equivalence 



15 Workload optimization: Efficient pattern mining on shared memory systems:
implications for chip multiprocessor architectures

Gregory Buehrer, Yen-Kuang Chen, Srinivasan Parthasarathy, Anthony Nguyen, Amol
Ghoting, Daehyun Kim

October 2006 Proceedings of the 2006 workshop on Memory system performance and
correctness MSPC '06
Publisher: ACM Press 

Full text available: pdf(232.73 KB) Additional Information: full citation, abstract, references, index terms

Frequent pattern mining is a fundamental data mining process which has practical 
applications ranging from market basket data analysis to web link analysis. In this work, 
we show that state-of-the-art frequent pattern mining applications are inefficient when 
executing on a shared memory multiprocessor system, due primarily to poor utilization of 
the memory hierarchy. To improve the efficiency of these applications, we explore memory 
performance improvements, task partitioning strategies, and tas ... 

16 Research sessions: Research 19: Information integration: Meaningful labeling of
integrated query interfaces

Eduard C. Dragut, Clement Yu, Weiyi Meng

September 2006 Proceedings of the 32nd international conference on Very large data
bases VLDB '06
Publisher: VLDB Endowment 

Full text available: pdf(3.36 MB) Additional Information: full citation, abstract, references, index terms

The contents of Web databases are accessed through queries formulated on complex user
interfaces. In many domains of interest (e.g. Auto) users are interested in obtaining 
information from alternative sources. Thus, they have to access many individual Web 
databases via query interfaces. We aim to construct automatically a well-designed query 
interface that integrates a set of interfaces in the same domain. This will permit users to 
access information uniformly from multiple sources. Earlier rese ... 

17 Learning and performing by exploration: label quality measured by latent semantic
analysis
Rodolfo Soto

May 1999 Proceedings of the SIGCHI conference on Human factors in computing
systems: the CHI is the limit CHI '99
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Models of learning and performing by exploration assume that the semantic similarity 
between task descriptions and labels on display objects (e.g., menus, tool bars) controls in 
part the users' search strategies. Nevertheless, none of the models has an objective way to
compute semantic similarity. In this study, Latent Semantic Analysis (LSA) was used to
compute semantic similarity between task descriptions and labels in an application's menu
system. Participants performed twelve tasks ... 

Keywords: cognitive models, label-following strategy, latent semantic analysis, learning 
by exploration, semantic similarity, usability analysis 



18 Project APRIL: a progress report
Robin Haigh, Geoffrey Sampson, Eric Atwell 

June 1988 Proceedings of the 26th annual meeting on Association for Computational 
Linguistics 

Publisher: Association for Computational Linguistics 

Full text available: pdf(766.32 KB), Publisher Site Additional Information: full citation, abstract, references

Parsing techniques based on rules defining grammaticality are difficult to use with 
authentic inputs, which are often grammatically messy. Instead, the APRIL system seeks a 
labelled tree structure which maximizes a numerical measure of conformity to statistical 
norms derived from a sample of parsed text. No distinction between legal and illegal trees 
arises: any labelled tree has a value. Because the search space is large and has an 
irregular geometry, APRIL seeks the best tree using simulated a ... 

19 Image Categorization by Learning and Reasoning with Regions 
Yixin Chen, James Z. Wang 

December 2004 The Journal of Machine Learning Research, Volume 5 
Publisher: MIT Press 

Full text available: pdf(1.31 MB) Additional Information: full citation, abstract, references, citings

Designing computer programs to automatically categorize images using low-level features 
is a challenging research topic in computer vision. In this paper, we present a new 
learning technique, which extends Multiple-Instance Learning (MIL), and its application to 
the problem of region-based image categorization. Images are viewed as bags, each of 
which contains a number of instances corresponding to regions obtained from image 
segmentation. The standard MIL problem assumes that a bag is labeled p ... 
Sudarshan S. Chawathe, Hector Garcia-Molina

June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data SIGMOD '97, Volume 26 Issue 2
Publisher: ACM Press 

Full text available: pdf(1.67 MB) Additional Information: full citation, abstract, references, citings, index terms

Detecting changes by comparing data snapshots is an important requirement for 
difference queries, active databases, and version and configuration management. In this 
paper we focus on detecting meaningful changes in hierarchically structured data, such as 
nested-object data. This problem is much more challenging than the corresponding one for 
relational or flat-file data. In order to describe changes better, we base our work not just 
on the traditional "atomic" insert, delete, u ... 
41 Graph mining: Laws, generators, and algorithms
Deepayan Chakrabarti, Christos Faloutsos
June 2006 ACM Computing Surveys (CSUR), Volume 38 Issue 1
Publisher: ACM Press 

Full text available: pdf(910.68 KB) Additional Information: full citation, abstract, references, index terms

How does the Web look? How could we tell an abnormal social network from a normal 
one? These and similar questions are important in many fields where the data can 
intuitively be cast as a graph; examples range from computer networks to sociology to 
biology and many more. Indeed, any M : N relation in database terminology can be 
represented as a graph. A lot of these questions boil down to the following: "How can we 
generate synthetic but realistic graphs?" To answer thi ... 



Keywords: Generators, graphs, patterns, social networks 



42 User studies: When will information retrieval be "good enough"?
James Allan, Ben Carterette, Joshua Lewis

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '05 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

We describe a user study that examined the relationship between the quality of an 
Information Retrieval system and the effectiveness of its users in performing a task. The 
task involves finding answer facets of questions pertaining to a collection of newswire 
documents over a six month period. We artificially created sets of ranked lists at 
increasing levels of quality by blending the output of a state-of-the-art retrieval system 
with truth data created by annotators. Subjects performed the task ... 

Keywords: information retrieval, passage retrieval, performance evaluation, user study 

43 The complexity of acyclic conjunctive queries
Georg Gottlob, Nicola Leone, Francesco Scarcello

May 2001 Journal of the ACM (JACM), Volume 48 Issue 3
Publisher: ACM Press

Full text available: pdf(566.16 KB) Additional Information: full citation, abstract, references, citings, index terms, review

This paper deals with the evaluation of acyclic Boolean conjunctive queries in relational 
databases. By well-known results of Yannakakis[1981], this problem is solvable in 
polynomial time; its precise complexity, however, has not been pinpointed so far. We
show that the problem of evaluating acyclic Boolean conjunctive queries is complete for 
LOGCFL, the class of decision problems that are logspace-reducible to a context-free 
language. Since LOGCFL is contained in AC1 and NC2, the eva ... 

Keywords: CSP, LOGCFL, acyclic hypergraph, algorithm, bounded treewidth, conjunctive 
query, constraint, constraint satisfaction problem, database theory, degree of cyclicity, 
hinge, join tree, parallel algorithm, query containment, query-width, subsumption, tree query



44 Hashing by proximity to process duplicates in spatial databases
Walid G. Aref, Hanan Samet

November 1994 Proceedings of the third international conference on Information and
knowledge management CIKM '94
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

In a spatial database, an object may extend arbitrarily in space. As a result, many spatial 
data structures (e.g., the quadtree, the cell tree, the R+-tree) represent an object by 
partitioning it into multiple, yet simple, pieces, each of which is stored separately inside 
the data structure. Many operations on these data structures are likely to produce 
duplicate results because of the multiplicity of object pieces. A novel approach for 
duplicate processing based on pro ... 

45 Text categorization: Using asymmetric distributions to improve text classifier
probability estimates

Paul N. Bennett

July 2003 Proceedings of the 26th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '03
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Text classifiers that give probability estimates are more readily applicable in a variety of 
scenarios. For example, rather than choosing one set decision threshold, they can be used 
in a Bayesian risk model to issue a run-time decision which minimizes a user-specified cost 
function dynamically chosen at prediction time. However, the quality of the probability 
estimates is crucial. We review a variety of standard approaches to converting scores (and 
poor probability estimates) from text classifi ... 

Keywords: active learning, classifier combination, cost-sensitive learning, text 
classification 



46 Domain-independent data cleaning via analysis of entity-relationship graph
Dmitri V. Kalashnikov, Sharad Mehrotra 

June 2006 ACM Transactions on Database Systems (TODS), Volume 31 Issue 2
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, appendices and supplements, abstract, references, index terms

In this article, we address the problem of reference disambiguation. Specifically, we 
consider a situation where entities in the database are referred to using descriptions (e.g., 
a set of instantiated attributes). The objective of reference disambiguation is to identify 
the unique entity to which each description corresponds. The key difference between the 
approach we propose (called RelDC) and the traditional techniques is that RelDC analyzes 
not only object features but also inter-obje ... 

Keywords: Connection strength, RelDC, data cleaning, entity resolution, graph analysis, 
reference disambiguation, relationship analysis 



47 Logic and logic programming 
J. A. Robinson 

March 1992 Communications of the ACM, Volume 35 Issue 3 
Publisher: ACM Press 

Full text available: pdf(6.56 MB) Additional Information: full citation, references, citings, index terms



Keywords: unification 



48 Integer programming vs. expert systems: an experimental comparison
Vasant Dhar, Nicky Ranganathan

March 1990 Communications of the ACM, Volume 33 Issue 3
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms, review

Expert system and integer programming formulations of an NP-complete constraint 
satisfaction problem are contrasted in terms of performance, ability to encode complex 
preferences, control of reasoning, and supporting incremental modification of solutions in 
response to changing input data. 

49 Formal models: Adapting ranking SVM to document retrieval
Yunbo Cao, Jun Xu, Tie-Yan Liu, Hang Li, Yalou Huang, Hsiao-Wuen Hon

August 2006 Proceedings of the 29th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '06
Publisher: ACM Press 

Full text available: pdf(402.44 KB) Additional Information: full citation, abstract, references, index terms

The paper is concerned with applying learning to rank to document retrieval. Ranking SVM 
is a typical method of learning to rank. We point out that there are two factors one must 
consider when applying Ranking SVM, in general a "learning to rank" method, to document 
retrieval. First, correctly ranking documents on the top of the result list is crucial for an 
Information Retrieval system. One must conduct training in a way that such ranked results 
are accurate. Second, the number of relevant docu ... 

Keywords: information retrieval, loss function, ranking SVM 



50 Symbolic evaluation and the global value graph
John H. Reif, Harry R. Lewis

January 1977 Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles
of programming languages POPL '77
Publisher: ACM Press

Full text available: pdf(1.34 MB) Additional Information: full citation, abstract, references, citings

This paper is concerned with difficult global flow problems which require the symbolic 
evaluation of programs. We use, as is common in global flow analysis, a model in which 
the expressions computed are specified, but the flow of control is indicated only by a 
directed graph whose nodes are blocks of assignment statements. We show that if such a 
program model is interpreted in the domain of integer arithmetic then many natural global 
flow problems are unsolvable. We then develop a direct (non-it ... 

51 On the optimal nesting order for computing N-relational joins

Toshihide Ibaraki, Tiko Kameda

September 1984 ACM Transactions on Database Systems (TODS), Volume 9 Issue 3
Publisher: ACM Press 

Full text available: pdf(1.39 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Using the nested loops method, this paper addresses the problem of minimizing the 
number of page fetches necessary to evaluate a given query to a relational database. We 
first propose a data structure whereby the number of page fetches required for query 
evaluation is substantially reduced and then derive a formula for the expected number of 
page fetches. An optimal solution to our problem is the nesting order of relations in the 
evaluation program, which minimizes the number of page fetche ... 

52 Ultraconservative online algorithms for multiclass problems 
Koby Crammer, Yoram Singer 

March 2003 The Journal of Machine Learning Research, Volume 3
Publisher: MIT Press 

Full text available: pdf(255.98 KB) Additional Information: full citation, abstract, citings, index terms

In this paper we study a paradigm to generalize online classification algorithms for binary 
classification problems to multiclass problems. The particular hypotheses we investigate 
maintain one prototype vector per class. Given an input instance, a multiclass hypothesis 
computes a similarity-score between each prototype and the input instance and sets the 
predicted label to be the index of the prototype achieving the highest similarity. To design 
and analyze the learning algorithms in this paper ... 

53 Research session 4: data integration & interoperability: Computing cores for data 
Georg Gottlob

June 2005 Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART 

symposium on Principles of database systems PODS '05 
Publisher: ACM Press 

Full text available: ^ pdf(239.10 KB) Additional Information: full citation, abstract, references, citings 

Data Exchange is the problem of inserting data structured under a source schema into a
target schema of different structure (possibly with integrity constraints), while reflecting 
the source data as accurately as possible. We study computational issues related to data 
exchange in the setting of Fagin, Kolaitis, and Popa (PODS'03). We use the technique of
hypertree decompositions to derive improved algorithms for computing the core of a 
relational instance with labeled nulls, a problem we show to ... 
54 A query language and optimization techniques for unstructured data
Peter Buneman, Susan Davidson, Gerd Hillebrand, Dan Suciu

June 1996 ACM SIGMOD Record, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data SIGMOD '96, Volume 25 Issue 2
Publisher: ACM Press 
Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

A new kind of data model has recently emerged in which the database is not constrained 
by a conventional schema. Systems like ACeDB, which has become very popular with 
biologists, and the recent Tsimmis proposal for data integration organize data in tree-like 
structures whose components can be used equally well to represent sets and tuples. Such 
structures allow great flexibility in data representation. What query language is
appropriate for such structures? Here we propose a simple language Un ... 

55 Improved Parameterized Complexity of the Maximum Agreement Subtree and
Maximum Compatible Tree Problems
Vincent Berry, François Nicolas

July 2006 IEEE/ACM Transactions on Computational Biology and Bioinformatics
(TCBB), Volume 3 Issue 3

Publisher: IEEE Computer Society Press 

Full text available: pdf(349.36 KB) Additional Information: full citation, abstract, references, index terms

Given a set of evolutionary trees on a same set of taxa, the maximum agreement subtree 
problem (MAST), respectively, maximum compatible tree problem (MCT), consists of 
finding a largest subset of taxa such that all input trees restricted to these taxa are 
isomorphic, respectively compatible. These problems have several applications in 
phylogenetics such as the computation of a consensus of phylogenies obtained from 
different data sets, the identification of species subjected to horizontal gene t ... 

Keywords: Phylogenetics, algorithms, consensus, pattern matching, trees, compatibility, 
fixed-parameter tractability. 



56 Research sessions: XML PubSub and indexing: Incremental maintenance of XML
structural indexes

Ke Yi, Hao He, Ioana Stanoi, Jun Yang

June 2004 Proceedings of the 2004 ACM SIGMOD international conference on 

Management of data SIGMOD '04 
Publisher: ACM Press 

Full text available: pdf(260.24 KB) Additional Information: full citation, abstract, references, citings

Increasing popularity of XML in recent years has generated much interest in query 
processing over graph-structured data. To support efficient evaluation of path expressions, 
many structural indexes have been proposed. The most popular ones are the 1-index, 
based on the notion of graph bisimilarity, and the recently proposed A(k)-index, based on
the notion of local similarity to provide a trade-off between index size and query answering 
power. For these indexes to be practical, we need eff ... 

57 Long papers: multimodal interaction: Multimodal new vocabulary recognition through
speech and handwriting in a whiteboard scheduling application
Edward C. Kaiser

January 2005 Proceedings of the 10th international conference on Intelligent user 

interfaces IUI '05
Publisher: ACM Press

Full text available: pdf(428.63 KB) Additional Information: full citation, abstract, references, citings, index
terms

Our goal is to automatically recognize and enroll new vocabulary in a multimodal interface. 
To accomplish this our technique aims to leverage the mutually disambiguating aspects of 
co-referenced, co-temporal handwriting and speech. The co-referenced semantics are
spatially and temporally determined by our multimodal interface for schedule chart 
creation. This paper motivates and describes our technique for recognizing out-of-
vocabulary (OOV) terms and enrolling them dynamically in the system. We ...

Keywords: multimodal interaction, mutual disambiguation, vocabulary learning 



58 Oblivious data structures: applications to cryptography 

Daniele Micciancio

May 1997 Proceedings of the twenty-ninth annual ACM symposium on Theory of
computing STOC '97 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



59 Direct transitive closure algorithms: design and performance evaluation

Rakesh Agrawal, Shaul Dar, H. V. Jagadish

September 1990 ACM Transactions on Database Systems (TODS), Volume 15 Issue 3
Publisher: ACM Press

Full text available: pdf(2.58 MB) Additional Information: full citation, abstract, references, citings, index

terms 

We present new algorithms for computing transitive closure of large database relations. 
Unlike iterative algorithms, such as the seminaive and logarithmic algorithms, the 
termination of our algorithms does not depend on the length of paths in the underlying 
graph (hence the name direct algorithms). Besides reachability computations, the 
proposed algorithms can also be used for solving path problems. We discuss issues related 
to the efficient implementation of these algorith ... 

Keywords: deductive databases, query processing, transitive closure 

60 Representing and querying XML with incomplete information
Serge Abiteboul, Luc Segoufin, Victor Vianu

March 2006 ACM Transactions on Database Systems (TODS), Volume 31 Issue 1
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, index terms

We study the representation and querying of XML with incomplete information. We 
consider a simple model for XML data and their DTDs, a very simple query language, and a 
representation system for incomplete information in the spirit of the representation
systems developed by Imielinski and Lipski [1984] for relational databases. In the scenario 
we consider, the incomplete information about an XML document is continuously enriched 
by successive queries to the document. We show that our representa ... 

Keywords: Incomplete information, XML 
You may want to try an Advanced Search for additional options.

Please review the Quick Tips below or for more information see the Search Tips . 

Quick Tips 

• Enter your search terms in lower case with a space between the terms. 

sales offices 

You can also enter a full question or concept in plain language. 
Where are the sales offices? 

• Capitalize proper nouns to search for specific people, places, or 
products. 

John Colter, Netscape Navigator 

• Enclose a phrase in double quotes to search for that exact phrase. 

"museum of natural history" "museum of modern art" 

• Narrow your searches by using a + if a search term must appear on a 
page. 

museum +art 

• Exclude pages by using a - if a search term must not appear on a page. 

museum -Paris 

Combine these techniques to create a specific search query. The better 
your description of the information you want, the more relevant your 
results will be. 

museum +"natural history" dinosaur -Chicago 
Research session: XML query processing #2: From region encoding to extended
Dewey: on efficient processing of XML twig pattern matching
Jiaheng Lu, Tok Wang Ling, Chee-Yong Chan, Ting Chen 

August 2005 Proceedings of the 31st international conference on Very large data 

bases VLDB '05 
Publisher: VLDB Endowment 

Full text available: pdf(353.14 KB) Additional Information: full citation, abstract, references, citings, index
terms

Finding all the occurrences of a twig pattern in an XML database is a core operation for 
efficient evaluation of XML queries. A number of algorithms have been proposed to process 
a twig query based on region encoding labeling scheme. While region encoding supports 
efficient determination of structural relationship between two elements, we observe that 
the information within a single label is very limited. In this paper, we propose a new 
labeling scheme, called extended Dewey. 

The relational model for database management: version 2
E. F. Codd 
January 1990 Book 

Publisher: Addison-Wesley Longman Publishing Co., Inc. 

Full text available: pdf(20.61 MB) Additional Information: full citation, abstract, references, citings, index
terms, review

From the Preface (See Front Matter for full Preface) 

An important adjunct to precision is a sound theoretical foundation. The relational model is 
solidly based on two parts of mathematics: first-order predicate logic and the theory of
relations. This book, however, does not dwell on the theoretical foundations, but rather on 
all the features of the relational model that I now perceive as important for database 
users, and therefore for DBMS vendors. My perceptions result from 20 y ... 

Accelerating XPath evaluation in any RDBMS 
Torsten Grust, Maurice Van Keulen, Jens Teubner 

March 2004 ACM Transactions on Database Systems (TODS), Volume 29 Issue 1
Publisher: ACM Press 

Additional Information: full citation, appendices and supplements

This article is a proposal for a database index structure, the XPath accelerator, that has
been specifically designed to support the evaluation of XPath path expressions. As such,
the index is capable of supporting all XPath axes (including ancestor, following, preceding-
sibling, descendant-or-self, etc.). This feature lets the index stand out among related work
on XML indexing structures which had a focus on the child and descendant axes only. The 
index has been designed with a close ... 

Keywords: Main-memory databases, XML, XML indexing, XPath 



Papers from the 2003 international conference on Database theory: Incremental 

validation of XML documents 

Andrey Balmin, Yannis Papakonstantinou, Victor Vianu 

December 2004 ACM Transactions on Database Systems (TODS), Volume 29 Issue 4
Publisher: ACM Press 

Full text available: pdf(676.95 KB) Additional Information: full citation, abstract, references, citings, index
terms

We investigate the incremental validation of XML documents with respect to DTDs, 
specialized DTDs, and XML Schemas, under updates consisting of element tag renamings, 
insertions, and deletions. DTDs are modeled as extended context-free grammars. 
"Specialized DTDs" allow the decoupling of element types from element tags. XML 
Schemas are abstracted as specialized DTDs with limitations on the type assignment. For 
DTDs and XML Schemas, we exhibit an O(m log n) incremental valida ...

Keywords: Update, XML, validation 



5 Containment and equivalence for a fragment of XPath
Gerome Miklau, Dan Suciu

January 2004 Journal of the ACM (JACM), Volume 51 Issue 1 

Publisher: ACM Press 

Full text available: pdf(367.27 KB) Additional Information: full citation, abstract, references, citings, index
terms, review

XPath is a language for navigating an XML document and selecting a set of element nodes.
XPath expressions are used to query XML data, describe key constraints, express 
transformations, and reference elements in remote documents. This article studies the 
containment and equivalence problems for a fragment of the XPath query language, with 
applications in all these contexts. In particular, we study a class of XPath queries that
contain branching, label wildcards and can express descendant relation ... 

Keywords: Tree pattern matching, XPath expressions, query containment, query 
equivalence 



Labeling images with a computer game
Luis von Ahn, Laura Dabbish 

April 2004 Proceedings of the SIGCHI conference on Human factors in computing 
systems CHI '04 

Publisher: ACM Press 

Full text available: pdf(493.67 KB) Additional Information: full citation, abstract, references, citings, index

terms 

We introduce a new interactive system: a game that is fun and can be used to create 
valuable output. When people play the game they help determine the contents of images 
by providing meaningful labels for them. If the game is played as much as popular online 
games, we estimate that most images on the Web can be labeled in a few months. Having
proper labels associated with each image on the Web would allow for more accurate image 
search, improve the accessibility of sites (by providing descriptio ... 

Keywords: World Wide Web, distributed knowledge acquisition, image labeling, online 
games 

Learning classifiers: Using URLs and table layout for web classification tasks
L. K. Shih, D. R. Karger

May 2004 Proceedings of the 13th international conference on World Wide Web
WWW '04

Publisher: ACM Press 

Full text available: pdf(357.43 KB) Additional Information: full citation, abstract, references, citings, index
terms

We propose new features and algorithms for automating Web-page classification tasks 
such as content recommendation and ad blocking. We show that the automated 
classification of Web pages can be much improved if, instead of looking at their textual 
content, we consider each link's URL and the visual placement of those links on a
referring page. These features are unusual: rather than being scalar measurements like 
word counts they are tree structured— describing the position of the item ... 

Keywords: classification, news recommendation, tree structures, web applications 



8 Research sessions: BLAS: an efficient XPath processing
system

Yi Chen, Susan B. Davidson, Yifeng Zheng

June 2004 Proceedings of the 2004 ACM SIGMOD international conference on 

Management of data SIGMOD '04 
Publisher: ACM Press 

Full text available: pdf(179.44 KB) Additional Information: full citation, abstract, references, citings

We present BLAS, a Bi-LAbeling based System, for efficiently processing complex XPath 
queries over XML data. BLAS uses P-labeling to process queries involving consecutive child 
axes, and D-labeling to process queries involving descendant axes traversal. The XML data 
is stored in labeled form, and indexed to optimize descendant axis traversals. Three
algorithms are presented for translating complex XPath queries to SQL expressions, and 
two alternate query engines are provided. Experimental result ... 



9 Research sessions: Research 19: Information integration: Meaningful labeling of
integrated query interfaces

Eduard C. Dragut, Clement Yu, Weiyi Meng 

September 2006 Proceedings of the 32nd international conference on Very large data 
bases VLDB '06 

Publisher: VLDB Endowment 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

The contents of Web databases are accessed through queries formulated on complex user 
interfaces. In many domains of interest (e.g. Auto) users are interested in obtaining 
information from alternative sources. Thus, they have to access many individual Web 
databases via query interfaces. We aim to construct automatically a well-designed query 
interface that integrates a set of interfaces in the same domain. This will permit users to 
access information uniformly from multiple sources. Earlier rese ... 

http://portal.acm.org/results.cfm?coll=ACM&dl=ACM&CFID=23421877&CFTOKEN=1946... 7/9/07

Results (page 1): +("term text database" or "text database") ^search +"new label" ^translate Page 4 of 7

10 Invited talk: The Lixto data extraction project: back and forth between theory and
practice

Georg Gottlob, Christoph Koch, Robert Baumgartner, Marcus Herzog, Sergio Flesca
June 2004 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART 
symposium on Principles of database systems PODS '04 

Publisher: ACM Press 

Full text available: pdf(430.70 KB) Additional Information: full citation, abstract, references, citings

We present the Lixto project, which is both a research project in database theory and a 
commercial enterprise that develops Web data extraction (wrapping) and Web service 
definition software. We discuss the project's main motivations and ideas, in particular the 
use of a logic-based framework for wrapping. Then we present theoretical results on 
monadic datalog over trees and on Elog, its close relative which is used as the internal 
wrapper language in the Lixto system. These results include both ... 




11 Subtext: uncovering the simplicity of programming 
Jonathan Edwards

October 2005 ACM SIGPLAN Notices, Proceedings of the 20th annual ACM SIGPLAN
conference on Object oriented programming, systems, languages, and
applications OOPSLA '05, Volume 40 Issue 10
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index

terms 

Representing programs as text strings makes programming harder than it has to be. The
source text of a program is far removed from its behavior. Bridging this conceptual gulf is 
what makes programming so inhumanly difficult - we are not compilers. Subtext is a new 
medium in which the representation of a program is the same thing as its execution. Like 
a spreadsheet, a program is visible and alive, constantly executing even as it is edited. 
Program edits are coherent semantic transformati ... 

Keywords: copying, non-textual programming, prototypes, visual programming 



12 Meaningful change detection in structured data 
Sudarshan S. Chawathe, Hector Garcia-Molina

June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data SIGMOD '97, Volume 26 Issue 2
Publisher: ACM Press

Full text available: pdf(1.67 MB) Additional Information: full citation, abstract, references, citings, index
terms

Detecting changes by comparing data snapshots is an important requirement for 
difference queries, active databases, and version and configuration management. In this 
paper we focus on detecting meaningful changes in hierarchically structured data, such as 
nested-object data. This problem is much more challenging than the corresponding one for 
relational or flat-file data. In order to describe changes better, we base our work not just 
on the traditional "atomic" insert, delete, u ... 

13 The theory of parsing, translation, and compiling

Alfred V. Aho, Jeffrey D. Ullman
January 1972 Book 

Publisher: Prentice-Hall, Inc. 

Full text available: pdf(98.28 MB) Additional Information: full citation, abstract, references, citings, index

terms 

From volume 1 Preface (See Front Matter for full Preface) 
This book is intended for a one or two semester course in compiling theory at the senior or
graduate level. It is a theoretically oriented treatment of a practical subject. Our 
motivation for making it so is threefold. 

(1) In an area as rapidly changing as Computer Science, sound pedagogy demands that 
courses emphasize ideas, rather than implementation details. It is our hope that the 
algorithms and concepts presen ... 

14 A decentralized model for information flow control
Andrew C. Myers, Barbara Liskov

October 1997 ACM SIGOPS Operating Systems Review, Proceedings of the sixteenth
ACM symposium on Operating systems principles SOSP '97, Volume 31 Issue 5

Publisher: ACM Press

Full text available: pdf(2.24 MB) Additional Information: full citation, references, citings



15 Declarative programming in a prototype-instance system: object-oriented
programming without writing methods

Brad A. Myers, Dario A. Giuse, Brad Vander Zanden

October 1992 ACM SIGPLAN Notices, conference proceedings on Object-oriented
programming systems, languages, and applications OOPSLA '92, Volume 27 Issue 10

Publisher: ACM Press

Full text available: pdf(2.19 MB) Additional Information: full citation, references, citings, index terms



16 Early aspects: Semantics-based composition for aspect-oriented
requirements engineering

Ruzanna Chitchyan, Awais Rashid, Paul Rayson, Robert Waters

March 2007 Proceedings of the 6th international conference on Aspect-oriented 

software development AOSD '07 
Publisher: ACM Press 

Full text available: pdf(373.53 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we discuss the limitations of the current syntactic composition mechanisms 
in aspect-oriented requirements engineering (AORE). We highlight that such composition 
mechanisms not only increase coupling between aspects and base concerns but are also 
insufficient to capture the intentionality of the aspect composition. Furthermore, they force 
the requirements engineer to reason about semantic influences and trade-offs among 
aspects from a syntactic perspective. We present a requiremen ... 

Keywords: aspect-oriented requirements engineering, expressive pointcuts, natural 
language processing, requirements composition 



Compiler construction: an advanced course 

F. L. Bauer, F. L. De Remer, M. Griffiths, U. Hill, J. J. Horning, C. H. A. Koster, W. M.
McKeeman, P. C. Poole, W. M. Waite, G. Goos, J. Hartmanis 
January 1974 Book 

Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf Additional Information: full citation, abstract, references, cited by

The Advanced Course took place from March 4 to 15, 1974 and was organized by the 
Mathematical Institute of the Technical University of Munich and the Leibniz Computing 
Center of the Bavarian Academy of Sciences, in co-operation with the European
Communities, sponsored by the Ministry for Research and Technology of the Federal 
Republic of Germany and by the European Research Office, London. 

18 Word reordering and a dynamic programming beam search algorithm for statistical
machine translation

Christoph Tillmann, Hermann Ney

March 2003 Computational Linguistics, Volume 29 Issue 1
Publisher: MIT Press

Full text available: pdf(877.85 KB) Additional Information: full citation, abstract, references, citings, index
terms

In this article, we describe an efficient beam search algorithm for statistical machine 
translation based on dynamic programming (DP). The search algorithm uses the 
translation model presented in Brown et al. (1993). Starting from a DP-based solution to 
the traveling-salesman problem, we present a novel technique to restrict the possible word 
reorderings between source and target language in order to achieve an efficient search 
algorithm. Word reordering restrictions especially useful for the tr ... 

Semiautomatic labelling of semantic features
Arantza Diaz de Ilarraza, Aingeru Mayor, Kepa Sarasola 

August 2002 Proceedings of the 19th international conference on Computational 

linguistics - Volume 1 
Publisher: Association for Computational Linguistics 

Full text available: pdf(212.73 KB) Additional Information: full citation, abstract, references

This paper presents the strategy and design of a highly efficient semiautomatic method for 
labelling the semantic features of common nouns, using semantic relationships between 
words, and based on the information extracted from an electronic monolingual dictionary. 
The method, that uses genus data, specific relators and synonymy information, obtains an 
accuracy of over 99% and a scope of 68,2% with regard to all the common nouns 
contained in a real corpus of over 1 million words, after the manua ... 

20 Using focus to generate complex and simple sentences
Marcia A. Derr, Kathleen R. McKeown 

July 1984 Proceedings of the 22nd annual meeting on Association for Computational 
Linguistics , Proceedings of the 10th international conference on 
Computational linguistics 
Publisher: Association for Computational Linguistics 
Full text available: pdf(671.35 KB) Additional Information: full citation, abstract, references, citings
Publisher Site

One problem for the generation of natural language text is determining when to use a 
sequence of simple sentences and when a single complex one is more appropriate. In this 
paper, we show how focus of attention is one factor that influences this decision and 
describe its implementation in a system that generates explanations for a student advisor 
expert system. The implementation uses tests on functional information such as focus of 
attention within the Prolog definite clause grammar formalism t ... 

21 Declarative visualization in the shared dataspace paradigm

Gruia-Catalin Roman, Kenneth C. Cox
May 1989 Proceedings of the 11th international conference on Software engineering
ICSE '89 

Publisher: ACM Press 

Full text available: pdf(1.55 MB) Additional Information: full citation, references, citings, index terms



22 Annotator: an AI approach to engineering drawing annotation
Barbara J. Vivier, Melvin K. Simmons, Sharon A. Masline

June 1988 Proceedings of the 1st international conference on Industrial and

engineering applications of artificial intelligence and expert systems - 
Volume 1 IEA/AIE '88 

Publisher: ACM Press 

Full text available: pdf(655.14 KB) Additional Information: full citation, abstract, references, index terms

Annotator is a prototype to investigate the application of AI techniques to the annotation 
of engineering drawings. In particular, Annotator addresses drawings of piping systems 
such as those for chemical plants or waste treatment facilities. The isometric 
representation of the piping system is selected because it is the most numerous type of 
drawing in plant design. Knowledge contained in hierarchies represents the CAD model of 
the piping system, features of the model and features of the d ... 

23 Translator writing systems
Jerome Feldman, David Gries

February 1968 Communications of the ACM, Volume 11 Issue 2
Publisher: ACM Press

Full text available: pdf(4.47 MB) Additional Information: full citation, abstract, references, citings

A critical review of recent efforts to automate the writing of translators of programming 
languages is presented. The formal study of syntax and its application to translator writing 
are discussed in Section II. Various approaches to automating the postsyntactic (semantic) 
aspects of translator writing are discussed in Section III, and several related topics in 
Section IV. 

Keywords: compiler, compiler-compiler, generator, macroprocessor, meta-assembler,
metacompiler, parser, semantics, syntactic analysis, syntax, syntax-directed, translator, 

24 Web clustering, filtering and applications: A search result clustering method using
informatively named entities
Hiroyuki Toda, Ryoji Kataoka

November 2005 Proceedings of the 7th annual ACM international workshop on Web 

information and data management WIDM '05 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index
terms

Clustering the results of a search helps the user to overview the information returned. In 
this paper, we regard the clustering task as indexing the search results. Here, an index 
means a structured label list that makes it easier for the user to comprehend the
labels and search results. To realize this goal, we make three proposals. First is to use 
Named Entity Extraction for term extraction. Second is a new label selecting criterion 
based on importance in the search result and the relation ... 

Keywords: named entity, search result clustering 



Paper session: Interconnection semantics for keyword search in XML
Sara Cohen, Yaron Kanza, Benny Kimelfeld, Yehoshua Sagiv

October 2005 Proceedings of the 14th ACM international conference on Information 

and knowledge management CIKM '05 
Publisher: ACM Press 

Full text available: pdf(214.90 KB) Additional Information: full citation, abstract, references, citings, index
terms

A framework for describing semantic relationships among nodes in XML documents is 
presented. In contrast to earlier work, the XML documents may have ID references (i.e., 
they correspond to graphs and not just trees). A specific interconnection semantics in this 
framework can be defined explicitly or derived automatically. The main advantage of 
interconnection semantics is the ability to pose queries on XML data in the style of 
keyword search. Several methods for automatically deriving int ... 

Keywords: XML, interconnection semantics, keyword search 



The complexity of acyclic conjunctive queries

Georg Gottlob, Nicola Leone, Francesco Scarcello

May 2001 Journal of the ACM (JACM), Volume 48 Issue 3

Publisher: ACM Press 

Full text available: pdf(566.16 KB) Additional Information: full citation, abstract, references, citings, index
terms, review

This paper deals with the evaluation of acyclic Boolean conjunctive queries in relational 
databases. By well-known results of Yannakakis [1981], this problem is solvable in
polynomial time; its precise complexity, however, has not been pinpointed so far. We 
show that the problem of evaluating acyclic Boolean conjunctive queries is complete for 
LOGCFL, the class of decision problems that are logspace-reducible to a context-free 
language. Since LOGCFL is contained in AC1 and NC2, the eva ... 

Keywords: CSP, LOGCFL, acyclic hypergraph, algorithm, bounded treewidth, conjunctive 
query, constraint, constraint satisfaction problem, database theory, degree of cyclicity, 

27 Taxonomic syntax for first order inference
David McAllester, Robert Givan

April 1993 Journal of the ACM (JACM), Volume 40 Issue 2
Publisher: ACM Press

Full text available: pdf(2.89 MB) Additional Information

A new polynomial time decidable fragment of first order logic is identified, and a general
method for using polynomial time inference procedures in knowledge representation
systems is presented. The results shown in this paper indicate that a nonstandard
"taxonomic" syntax is essential in constructing natural and powerful polynomial time
inference procedures. The central role of taxonomic syntax in the polynomial time
inference procedures provides technical support for the often ...

Keywords: automated reasoning, inference rules, machine inference, mechanical
verification, polynomial time algorithms, proof systems, proof theory, theorem proving

28 Graph mining: Laws, generators, and algorithms
Deepayan Chakrabarti, Christos Faloutsos
June 2006 ACM Computing Surveys (CSUR), Volume 38 Issue 1
Publisher: ACM Press

Full text available: pdf(910.68 KB) Additional Information: full citation, abstract, references, index terms

How does the Web look? How could we tell an abnormal social network from a normal 
one? These and similar questions are important in many fields where the data can 
intuitively be cast as a graph; examples range from computer networks to sociology to 
biology and many more. Indeed, any M : N relation in database terminology can be 
represented as a graph. A lot of these questions boil down to the following: "How can we 
generate synthetic but realistic graphs?" To answer thi ... 

Keywords: Generators, graphs, patterns, social networks 



29 A tool for the deterministic scheduling of real-time programs implemented as periodic
Ada tasks

E. W. Giering, T. P. Baker

September 1994 ACM SIGAda Ada Letters, Proceedings of the second international
symposium on Environments and tools for Ada SETA2, Volume XIV Issue SI
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, index terms

In this paper, we describe an experimental tool for the scheduling and execution of real- 
time programs on a single processor. This tool accepts a real-time program implemented 
as a system of periodic tasks written in a subset of Ada. It translates the program into 
equivalent Ada source code in which the task bodies are executed by a run-time 
dispatcher according to a deterministic, cyclic schedule. The schedule is represented as a
table of scheduling actions describing the execution of the progra ... 

30 Logic and logic programming
J. A. Robinson

March 1992 Communications of the ACM, Volume 35 Issue 3
Keywords: unification



31 A query language and optimization techniques for unstructured data
Peter Buneman, Susan Davidson, Gerd Hillebrand, Dan Suciu

June 1996 ACM SIGMOD Record, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data SIGMOD '96, Volume 25 Issue 2
Publisher: ACM Press

Full text available: pdf(1.19 MB) Additional Information: full citation, abstract, references, citings, index
terms

A new kind of data model has recently emerged in which the database is not constrained 
by a conventional schema. Systems like ACeDB, which has become very popular with 
biologists, and the recent Tsimmis proposal for data integration organize data in tree-like 
structures whose components can be used equally well to represent sets and tuples. Such 
structures allow great flexibility in data representation. What query language is
appropriate for such structures? Here we propose a simple language Un ... 

32 Systemic classification and its efficiency

Chris Brew

December 1991 Computational Linguistics, Volume 17 Issue 4
Publisher: MIT Press

Full text available: pdf(2.20 MB) Additional Information: full citation, abstract, references, citings
Publisher Site

This paper examines the problem of classifying linguistic objects on the basis of 
information encoded in the system network formalism developed by Halliday. It is shown 
that this problem is NP-hard, and a restriction to the formalism, which renders the 
classification problem soluble in polynomial time, is suggested. An algorithm for the 
unrestricted classification problem, which separates a potentially expensive second stage 
from a more tractable first stage, is then presented. 

33 Integer programming vs. expert systems: an experimental comparison 
Vasant Dhar, Nicky Ranganathan 

March 1990 Communications of the ACM, volume 33 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.46 MB) Additional Information: full citation, abstract, references, citings, index 
terms, review 

Expert system and integer programming formulations of an NP-complete constraint 
satisfaction problem are contrasted in terms of performance, ability to encode complex 
preferences, control of reasoning, and supporting incremental modification of solutions in 
response to changing input data. 

34 Games: Representation of interwoven surfaces in 2 1/2 D drawing 
Keith Wiley, Lance R. Williams 

April 2006 Proceedings of the SIGCHI conference on Human Factors in computing 
systems CHI '06 
Publisher: ACM Press 

Full text available: pdf(960.05 KB) Additional Information: full citation, abstract, references, index terms 

The state-of-the-art in computer drawing programs is based on a number of concepts that 
are over two decades old. One such concept is the use of layers for ordering the surfaces 
in a drawing from top to bottom. Unfortunately, the use of layers unnecessarily imposes a 
partial ordering on the depths of the surfaces and prevents the user from creating a large 
class of potential drawings, e.g., of Celtic knots and interwoven surfaces. In this paper we 
describe a novel approach which only requires lo ... 

Keywords: braids, computational topology, constraint propagation, drawing programs, 
knot diagrams, layers, surfaces 



35 Symbolic evaluation and the global value graph 
John H. Reif, Harry R. Lewis 

January 1977 Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles 
of programming languages POPL '77 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings 

This paper is concerned with difficult global flow problems which require the symbolic 
evaluation of programs. We use, as is common in global flow analysis, a model in which 
the expressions computed are specified, but the flow of control is indicated only by a 
directed graph whose nodes are blocks of assignment statements. We show that if such a 
program model is interpreted in the domain of integer arithmetic then many natural global 
flow problems are unsolvable. We then develop a direct (non-it ... 

36 Weighted hypertree decompositions and optimal query plans 
Francesco Scarcello, Gianluigi Greco, Nicola Leone 

June 2004 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART 
symposium on Principles of database systems PODS '04 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings 

Hypertree width [22, 25] is a measure of the degree of cyclicity of hypergraphs. A number 
of relevant problems from different areas, e.g., the evaluation of conjunctive queries in 
database theory or the constraint satisfaction in AI, are tractable when their underlying 
hypergraphs have bounded hypertree width. However, in practical contexts like the 
evaluation of database queries, we have more information besides the structure of 
queries. For instance, we know the number of tuples in relations, ... 



37 Analysis of recursive state machines 
Rajeev Alur, Michael Benedikt, Kousha Etessami, Patrice Godefroid, Thomas Reps, Mihalis 
Yannakakis 

July 2005 ACM Transactions on Programming Languages and Systems (TOPLAS), 
Volume 27 Issue 4 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index 
terms 

Recursive state machines (RSMs) enhance the power of ordinary state machines by 
allowing vertices to correspond either to ordinary states or to potentially recursive 
invocations of other state machines. RSMs can model the control flow in sequential 
imperative programs containing recursive procedure calls. They can be viewed as a visual 
notation extending Statecharts-like hierarchical state machines, where concurrency is 
disallowed but recursion is allowed. They are also related to various models ... 

Keywords: Software verification, context-free languages, model checking, program 
analysis, pushdown automata, recursive state machines, temporal logic 
38 On the optimal nesting order for computing N-relational joins 
Toshihide Ibaraki, Tiko Kameda 

September 1984 ACM Transactions on Database Systems (TODS), volume 9 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.39 MB) Additional Information: full citation, abstract, references, citings, index 
terms, review 

Using the nested loops method, this paper addresses the problem of minimizing the 
number of page fetches necessary to evaluate a given query to a relational database. We 
first propose a data structure whereby the number of page fetches required for query 
evaluation is substantially reduced and then derive a formula for the expected number of 
page fetches. An optimal solution to our problem is the nesting order of relations in the 
evaluation program, which minimizes the number of page fetche ... 

39 Research session 4: data integration & interoperability: Computing cores for data 
exchange: new algorithms and practical solutions 
Georg Gottlob 

June 2005 Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART 
symposium on Principles of database systems PODS '05 
Publisher: ACM Press 

Full text available: pdf(239.10 KB) Additional Information: full citation, abstract, references, citings 

Data Exchange is the problem of inserting data structured under a source schema into a 
target schema of different structure (possibly with integrity constraints), while reflecting 
the source data as accurately as possible. We study computational issues related to data 
exchange in the setting of Fagin, Kolaitis, and Popa (PODS '03). We use the technique of 
hypertree decompositions to derive improved algorithms for computing the core of a 
relational instance with labeled nulls, a problem we show to ... 



40 Solution space navigation for geometric constraint systems 

Meera Sitharam, Adam Arbree, Yong Zhou, Naganandhini Kohareswaran 

April 2006 ACM Transactions on Graphics (TOG), volume 25 issue 2 
Publisher: ACM Press 

Full text available: pdf(446.25 KB) Additional Information: full citation, abstract, references, index terms 

We study the well documented problem of systematically navigating the potentially 
exponentially many roots or realizations of well-constrained, variational geometric 
constraint systems. We give a scalable method called the Equation and Solution Manager 
(ESM) that can be used both for automatic searches and visual, user-driven searches for 
desired realizations. The method incrementally assembles the desired solution of the entire 
system and avoids combinatorial explosion by offering the user a vi ... 

Keywords: Root selection for geometric constraint systems, conceptual design, constraint 
graphs, cyclical and 3D geometric constraint systems, decomposition of geometric 
constraint systems, degree of freedom analysis, feature-based and assembly modeling, 
underconstrained and overconstrained systems, variational geometric constraint solving, 
well constrained systems 